TREC 2002 Cross-lingual Retrieval at BBN

Authors

  • Alexander M. Fraser
  • Jinxi Xu
  • Ralph M. Weischedel
Abstract

Two sets of parameters are important in the retrieval model. One is the set of translation probabilities P(tq|td). In TREC 2001, for efficiency reasons, we used Model 1 from the statistical MT work of Brown et al. (1993) to estimate term translation probabilities from a parallel corpus. With more computing power at our disposal, for TREC 2002 we used the more complex but potentially more accurate Model 4 for the same purpose. The differences between the two models are discussed by Brown et al. (1993).
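To make the idea of estimating term translation probabilities from a parallel corpus concrete, here is a minimal sketch of Model 1 style EM training. It is not the authors' implementation: the function name `train_model1`, the three-sentence toy corpus, and the omission of the NULL word are all assumptions made for illustration, and reading tq as a query-language term and td as a document-language term is likewise an assumption, since the abstract does not define them.

```python
from collections import defaultdict

def train_model1(pairs, iterations=10):
    """Estimate t[(q, d)] ~ P(q | d) with Model 1 style EM.

    pairs: list of (document_language_tokens, query_language_tokens).
    Hypothetical helper for illustration; no NULL word, no smoothing.
    """
    query_vocab = {q for _, qs in pairs for q in qs}
    uniform = 1.0 / len(query_vocab)
    t = defaultdict(lambda: uniform)  # uniform start for every (q, d)

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of d translating to q
        total = defaultdict(float)  # expected count of d translating to anything
        # E-step: split each query-side token fractionally across
        # the document-side tokens of the same sentence pair.
        for ds, qs in pairs:
            for q in qs:
                norm = sum(t[(q, d)] for d in ds)
                for d in ds:
                    frac = t[(q, d)] / norm
                    count[(q, d)] += frac
                    total[d] += frac
        # M-step: renormalise so that sum over q of P(q | d) equals 1.
        t = defaultdict(float, {(q, d): c / total[d]
                                for (q, d), c in count.items()})
    return t

# Toy parallel corpus (illustrative only, not TREC data).
pairs = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
    (["ein", "buch"], ["a", "book"]),
]
t = train_model1(pairs)
```

Model 4 adds fertility and distortion parameters on top of these lexical probabilities, which is one reason it costs more to train; the sketch above covers only the Model 1 lexical table.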

Similar resources

TREC 2001 Cross-lingual Retrieval at BBN

BBN participated only in the cross-lingual track in TREC 2001. Arabic, the language of the TREC 2001 corpus, presents a number of challenges to both monolingual and cross-lingual IR. First, many inflected Arabic words can correspond to multiple uninflected words, requiring context to disambiguate them. Second, orthographic variations are prevalent; certain glyphs are sometimes written as differe...

English-Chinese Cross-Lingual Retrieval Using a Translation Package

Using a COTS English-Chinese bidirectional translation software package together with our PIRCS bilingual retrieval system, we performed English-Chinese cross-lingual retrieval experiments using the TREC Chinese collections and queries. With some simple approaches, we attain an effectiveness of about 67% of the monolingual Chinese results.

Cross-lingual Information Retrieval Using Hidden Markov Models

This paper presents empirical results in cross-lingual information retrieval using English queries to access Chinese documents (TREC-5 and TREC-6) and Spanish documents (TREC-4). Since our interest is in languages where resources may be minimal, we use an integrated probabilistic model that requires only a bilingual dictionary as a resource. We explore how a combined probability model of term t...

IIT at TREC-10

For TREC-10, we participated in the adhoc and manual web tracks and in both the site-finding and cross-lingual tracks. For the adhoc track, we did extensive calibrations and learned that combining similarity measures yields little improvement. This year, we focused on a single high-performance similarity measure. For site finding, we implemented several algorithms that did well on the data provi...

Publication date: 2002